A Formal Proof of a Paradox Associated with Cohen's Kappa
Author
Abstract
Suppose two judges each classify a group of objects into one of several nominal categories. It has been observed in the literature that, for fixed observed agreement between the judges, Cohen’s kappa penalizes judges with similar marginals compared to judges who produce different marginals. This paper presents a formal proof of this phenomenon.
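The effect described above can be seen directly in the definition of kappa, κ = (P_o − P_e)/(1 − P_e), where P_o is the observed agreement and P_e the chance agreement computed from the judges' marginal proportions: more similar marginals inflate P_e and thus deflate κ. The following sketch illustrates this with hypothetical numbers that are not taken from the paper; the two contingency tables are constructed only so that both have the same observed agreement.

```python
# Illustrative sketch with hypothetical tables (not from the paper).
# Cohen's kappa = (P_o - P_e) / (1 - P_e), where P_o is the observed
# proportion of agreement and P_e the agreement expected by chance,
# computed from the products of the two judges' marginal proportions.

def cohens_kappa(table):
    """Cohen's kappa for a square contingency table of counts or proportions."""
    k = len(table)
    total = sum(sum(row) for row in table)
    p = [[cell / total for cell in row] for row in table]
    p_o = sum(p[i][i] for i in range(k))                       # observed agreement
    row_marg = [sum(p[i][j] for j in range(k)) for i in range(k)]
    col_marg = [sum(p[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(row_marg[i] * col_marg[i] for i in range(k))     # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Both tables have observed agreement P_o = 0.60.
similar_marginals   = [[0.45, 0.20], [0.20, 0.15]]  # marginals (0.65, 0.35) for both judges
different_marginals = [[0.25, 0.35], [0.05, 0.35]]  # marginals (0.60, 0.40) vs (0.30, 0.70)

print(cohens_kappa(similar_marginals))    # ~0.121: similar marginals, lower kappa
print(cohens_kappa(different_marginals))  # ~0.259: different marginals, higher kappa
```

With identical observed agreement, the pair of judges with the more similar marginals receives the lower kappa; this is the penalty that the paper proves formally.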
Similar articles
High Agreement and High Prevalence: The Paradox of Cohen’s Kappa
Background: Cohen's Kappa is the most widely used agreement statistic in the literature. However, under certain conditions it is affected by a paradox that yields biased estimates of the statistic itself. Objective: The aim of the study is to provide sufficient information to allow the reader to make an informed choice of agreement measure, by underlining some optimal properties of Gwet...
Diagonal arguments and fixed points
A universal schema for diagonalization was popularized by N.S. Yanofsky (2003), based on pioneering work of F.W. Lawvere (1969), in which the existence of a (diagonalized-out and contradictory) object implies the existence of a fixed point for a certain function. It was shown that many self-referential paradoxes and diagonally proved theorems fit in that schema. Here, we fi...
The meaning of kappa: probabilistic concepts of reliability and validity revisited.
A framework, the "agreement concept," is developed to study the use of Cohen's kappa as well as alternative measures of chance-corrected agreement in a unified manner. Focusing on intrarater consistency, it is demonstrated that for 2 x 2 tables an adequate choice between different measures of chance-corrected agreement can be made only if the characteristics of the observational setting are take...
A comparison of Cohen’s Kappa and Gwet’s AC1 when calculating inter-rater reliability coefficients: a study conducted with personality disorder samples
Background: Rater agreement is important in clinical research, and Cohen's Kappa is a widely used method for assessing inter-rater reliability; however, there are well-documented statistical problems associated with the measure. In order to assess its utility, we evaluated it against Gwet's AC1 and compared the results. Methods: This study was carried out across 67 patients (56% males) aged 18 ...
Validating and reliability testing the descriptive data and three different disease diagnoses of the internet-based DOGRISK questionnaire
Background: The DOGRISK questionnaire is an ongoing internet-based study of canine nutrition, living environment, and disease. Here we aim to assess the performance of the questionnaire, using data from the first three years, in relation to some descriptive and disease variables. We used associated questions, official register records, test-retest repeatability, and email/mail contact with questio...
Journal: J. Classification
Volume: 27
Issue: -
Pages: -
Year: 2010